\documentclass[a4paper, 12pt, twocolumn]{article}

\usepackage {epsfig}

\usepackage[icelandic, english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{graphicx}
\usepackage{times}
\usepackage{fancybox}
\usepackage[T1]{fontenc}
\usepackage{fancyhdr}
%\pagestyle{fancy}
%\headheight 35pt

% Title Page
\title{Advanced Database Systems\\ Project 4}
\author{Ari Þór H. Arnbjörnsson, Ívar Björn Hilmarsson,\\ 
	and Stefán Örn Finnbjörnsson\\ 
	\{ari06, stefanf06, ivarh07 \}@ru.is} 
\date{\small 19 November, 2008\\
\begin{figure}[!b]
  \begin{center}
    \includegraphics[scale=0.5]{RU-logo-EN.eps}
  \end{center}
\end{figure}
}
\usepackage{lipsum}

\begin{document}

\twocolumn[%
%  Title and authors
   \begin{center}
     {\huge\sffamily Advanced Database Systems\\ Project 4}\\
      \vspace{2ex}
      By \textsc{Ari Þór H. Arnbjörnsson\footnotemark[1], Ívar Björn Hilmarsson\footnotemark[2],\\ 
	and Stefán Örn Finnbjörnsson\footnotemark[3]}
   \end{center}
%  Abstract
   \begin{quotation}
   \begin{center}
   \textbf{Abstract}
   \end{center}
      \small
	\noindent In this assignment we focus on optimizing the code from projects 2 and 3.\\
	This paper concerns improving and fine-tuning the code of a cluster-based partitioning and search method.
   \end{quotation}
   \begin{quotation}
   \begin{center}
   \textbf{Acknowledgement:}
   \end{center}
   \small
	\noindent We want to thank Gylfi Þór Guðmundsson for all the help, for the hours spent on debugging, and especially for all the trash talk.
   \end{quotation}
   \vspace{2ex}%
]

\footnotetext[1]{ari06@ru.is}
\footnotetext[2]{ivarh07@ru.is}
\footnotetext[3]{stefanf06@ru.is}
% Article body

%\maketitle
%\thispagestyle{empty}
%\newpage

% \begin{abstract}
% \noindent In this assignment we focus on optimizing the code from project 2 and 3,\\
% \noindent This papaer concerns improving and finetuning the code of a cluster based partitioning and search method.
% \end{abstract}
% %\thispagestyle{empty}
% %\newpage
% 
% \renewcommand{\abstractname}{Acknowledgements}
% \begin{abstract}
%  \noindent We want to thank Gylfi Þór Guðmundsson for all the help and the hours spent on debugging and specially for all the trash talk.
% \end{abstract}
% %\thispagestyle{empty}
% %\newpage
% 
% \tableofcontents
%\pagenumbering{roman}
%\newpage
%\pagenumbering{arabic}
\section{Introduction}
In the Advanced Database Systems course we were given a project in four parts. In the first part we implemented a sequential scan. In the second part we changed the first part to use cluster-based searching for multiple local query descriptors. In the third part we implemented an efficient clustering algorithm that was supposed to improve the query results significantly. In the fourth part we experiment with the code from parts two and three, trying to gain more speed while losing as little as possible of the search quality.
%\newpage
\section{Project}
The project this semester was built in four parts.
Our goal was to create a fairly efficient cluster-based high-dimensional search engine for
local descriptors.\\

In the first part, we were to create a search engine based on sequential scan
that used a simple index structure which we could use throughout the semester.\\

In the second part, we were to change the first part to use cluster-based searching
for multiple local query descriptors.\\

In the third part, we implemented an efficient clustering algorithm.
That clustering algorithm was supposed to improve the query results significantly.\\

Now, in the fourth part, we are experimenting with the code from parts two and three.
We use several methods to gain more speed while clustering, where we also have to
consider the quality of the output.
We are also trying to gain more speed while searching, but our main goal was to improve
the clustering time.
\section{Methods and Hardware}
For developing and testing each version we used a Packard Bell laptop running Linux. For full details see Sections~\ref{hw}, \ref{os}, and~\ref{gcc}.
\subsection{Hardware}
\label{hw}
Type: Packard Bell\\
Processor: Intel Centrino duo T930 (2.5\,GHz)\\
Memory: 4\,GB DDR
\subsection{Operating System}
\label{os}
OS: SuSE Linux 11.0\\
Kernel: Linux 2.6.25.16-0.1-pae i686\\
\subsection{Compiler}
\label{gcc}
We used the standard open-source GNU compiler (GCC) 4.0. For optimization we tested both \texttt{-O2} and \texttt{-O3}; \texttt{-O3} gave slightly better times on the large dataset.
\subsection{Level 1 Clustering}
We implemented level 1 clustering and used Duff's Device for the distance calculations, both with and without temporary variables.
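
The two variants can be sketched in C as follows. This is a minimal illustration and not the project's actual code: the function names \texttt{dist2\_duff\_tmp} and \texttt{dist2\_duff\_inl}, the \texttt{float} descriptor type, the squared Euclidean distance and the unroll factor of eight are all our assumptions.

{\scriptsize
\begin{verbatim}
```c
#include <stddef.h>

/* Illustrative sketch only: names, float type and the 8x
   unroll factor are assumptions, not the project code. */

/* One term of the sum, difference kept in a temporary. */
#define STEP_TMP { float d = *a++ - *b++; sum += d * d; }
/* Same term without a temporary. */
#define STEP_INL { sum += (*a - *b) * (*a - *b); ++a; ++b; }

/* Squared Euclidean distance, with temporary variables. */
float dist2_duff_tmp(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    size_t passes = (n + 7) / 8;  /* loop trips, rounded up */

    if (n == 0)
        return sum;               /* the device needs n > 0 */
    switch (n % 8) {              /* jump into unrolled body */
    case 0: do { STEP_TMP
    case 7:      STEP_TMP
    case 6:      STEP_TMP
    case 5:      STEP_TMP
    case 4:      STEP_TMP
    case 3:      STEP_TMP
    case 2:      STEP_TMP
    case 1:      STEP_TMP
            } while (--passes > 0);
    }
    return sum;
}

/* Same computation, without temporary variables. */
float dist2_duff_inl(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    size_t passes = (n + 7) / 8;

    if (n == 0)
        return sum;
    switch (n % 8) {
    case 0: do { STEP_INL
    case 7:      STEP_INL
    case 6:      STEP_INL
    case 5:      STEP_INL
    case 4:      STEP_INL
    case 3:      STEP_INL
    case 2:      STEP_INL
    case 1:      STEP_INL
            } while (--passes > 0);
    }
    return sum;
}
```
\end{verbatim}
}

Both functions return the same value for the same input; they differ only in whether the difference of two coordinates is stored before it is squared. Which one is faster may depend on the compiler and the optimization level.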
\newpage
\section{Cluster Results}
Table~\ref{table:cdt} shows the total time each run took to cluster each dataset. The clustering technique used is level 1 clustering with Duff's Device in the distance calculations, here using temporary variables. Clustering the large dataset took on average 6379.78 seconds, or about 1 hour, 46 minutes and 20 seconds. The results in Table~\ref{table:cd} are from the same clustering technique with a minor change to the Duff's Device: it does not use the temporary variables, which gave slightly better results. In some cases the version with temporary variables should nevertheless be faster. \\
On the large dataset the best average time (6240.4 seconds, over five runs) was only about 139 seconds, or roughly 2.3 minutes, faster, which is an improvement of 2.2\%. Clustering times with no optimization can be seen in Table~\ref{table:cdno}.

Compared with no optimization, Duff's Device improved the average times by about 7.3\% on the medium dataset and 20\% on the small dataset.
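
The relative improvements quoted here follow from the averages in the tables. As a worked example, write $t_{\mathrm{tmp}}$ and $t_{\mathrm{inl}}$ (our notation, not used elsewhere in the project) for the large-dataset averages with and without temporary variables:
\begin{eqnarray*}
\frac{t_{\mathrm{tmp}} - t_{\mathrm{inl}}}{t_{\mathrm{tmp}}}
 & = & \frac{6379.78 - 6240.4}{6379.78} \\
 & = & \frac{139.38}{6379.78} \\
 & \approx & 0.022 ,
\end{eqnarray*}
that is, an improvement of roughly 2\%.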
\begin{table}[!h]
 \centering
   \begin{tabular}{|c||c|c|c|}	
	\hline
		\multicolumn{4}{|c|}{\small Times are in seconds}\\
	\hline
	\hline
		Clustering & \multicolumn{3}{|c|}{\bf Datasets}\\
		Runs & \bf Small & \bf Medium & \bf Large \\
	\hline
		1 & 7 & 219 & 6541 \\
	\hline
		2 & 8 & 216 & 6492 \\
	\hline
		3 & 7 & 215 & 6387 \\
	\hline
		4 & 8 & 216 & 6292 \\
	\hline
		5 & 7 & 216 & 6350 \\
	\hline
		6 & 8 & 215 & 6340 \\
	\hline
		7 & 7 & 214 & 6337 \\
	\hline
		8 & 8 & 215 & 6317 \\
	\hline
		9 & 8 & 218 & 6362 \\
	\hline
	\hline
		\bf Average & 7.56 & 216 & 6379.78\\
	\hline
   \end{tabular}
    \caption[Cluster Results]{\small Clustering results, using Duff's Device with temporary variables}
    \label{table:cdt}
\end{table}

\begin{table}[!ht]

 \centering
   \begin{tabular}{|c||c|c|c|}
	\hline
		\multicolumn{4}{|c|}{\small Times are in seconds}\\
	\hline
	\hline
		Clustering & \multicolumn{3}{|c|}{\bf Datasets}\\
		Runs & \bf Small & \bf Medium & \bf Large \\
	\hline
		1 & 7 & 224 & 6939 \\
	\hline
		2 & 7 & 203 & 6053 \\
	\hline
		3 & 7 & 204 & 6108 \\
	\hline
		4 & 7 & 205 & 6036 \\
	\hline
		5 & 8 & 204 & 6066 \\
	\hline
		6 & 6 & 205 &  \\
	\hline
		7 & 7 & 206 &  \\
	\hline
		8 & 7 & 203 &  \\
	\hline
		9 & 7 & 206 &  \\
	\hline
	\hline
		\bf Average & 6.22 & 206.67 & 6240.4\\
	\hline
    \end{tabular}
    \caption[Second Cluster Results]{\small Clustering results, using Duff's Device without temporary variables}
    \label{table:cd}
 \end{table}
\begin{table}[!htb]

 \centering
   \begin{tabular}{|c||c|c|c|}
	\hline
		\multicolumn{4}{|c|}{\small Times are in seconds}\\
	\hline
	\hline
		Clustering & \multicolumn{3}{|c|}{\bf Datasets}\\
		Runs & \bf Small & \bf Medium & \bf Large \\
	\hline
		1 & 8 & 241 & 6927 \\
	\hline
		2 & 8 & 220 & 6396 \\
	\hline
		3 & 8 & 222 & 6424 \\
	\hline
		4 & 7 & 222 & 6474 \\
	\hline
		5 & 8 & 222 & 6474 \\
	\hline
		6 & 8 & 221 &  \\
	\hline
		7 & 8 & 219 &  \\
	\hline
		8 & 7 & 221 &  \\
	\hline
		9 & 8 & 219 &  \\
	\hline
	\hline
		\bf Average & 7.78 & 223 & 6539 \\
	\hline
    \end{tabular}
    \caption[no Duff Cluster Results]{\small Clustering results, with no optimization}
    \label{table:cdno}
 \end{table}
 \newpage
\section{Searching}
We took the search techniques from the given Project 2 solutions and applied Duff's Device to the distance calculations. This was the search technique we used for testing; no further work was done on that part.
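
For readers unfamiliar with cluster-based search, the general idea can be sketched in C as below. This is only an illustration under our own assumptions: the \texttt{Cluster} layout, the function \texttt{cluster\_search}, and the choice to scan just the single nearest cluster are hypothetical and are not taken from the Project 2 solutions. The plain \texttt{dist2} loop stands in for the distance calculation.

{\scriptsize
\begin{verbatim}
```c
#include <stddef.h>
#include <float.h>

/* Hypothetical in-memory layout; not the project code. */
typedef struct {
    const float *rep;      /* representative, dim floats   */
    const float *members;  /* count descriptors, dim each  */
    size_t count;
} Cluster;

/* Plain squared Euclidean distance; an unrolled version
   could be swapped in here. */
static float dist2(const float *a, const float *b, size_t dim)
{
    float sum = 0.0f;
    size_t i;
    for (i = 0; i < dim; i++) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/* Step 1: pick the cluster whose representative is
   closest to the query q.
   Step 2: scan only that cluster for the nearest member.
   Returns 0 on success, -1 if there is nothing to search. */
int cluster_search(const Cluster *cl, size_t ncl, size_t dim,
                   const float *q,
                   size_t *out_cluster, size_t *out_member)
{
    float best = FLT_MAX;
    size_t c, m, bc = 0, bm = 0;

    if (ncl == 0)
        return -1;
    for (c = 0; c < ncl; c++) {
        float d = dist2(q, cl[c].rep, dim);
        if (d < best) { best = d; bc = c; }
    }
    if (cl[bc].count == 0)
        return -1;
    best = FLT_MAX;
    for (m = 0; m < cl[bc].count; m++) {
        float d = dist2(q, cl[bc].members + m * dim, dim);
        if (d < best) { best = d; bm = m; }
    }
    *out_cluster = bc;
    *out_member = bm;
    return 0;
}
```
\end{verbatim}
}

Since every query descriptor is compared against the representatives and then against the members of one cluster, the distance function is called many times per query, which is why it is the natural place to apply an optimization such as Duff's Device.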
\section{Discussion}
\section{Conclusion}

\end{document}          
